This project will examine consumer shopping trends and purchase behaviors using the Customer Shopping (Latest Trends) Dataset. The analysis will focus on uncovering patterns in retail purchasing across various product categories, customer demographics, and purchase channels.
“Consumer Shopping Trends: Insights into Purchase Behavior and Patterns”
URL: https://www.kaggle.com/datasets/bhadramohit/customer-shopping-latest-trends-dataset/data
License: Community Data License Agreement - Sharing - Version 1.0
Expected update frequency: Quarterly
Tags: Business, Clothing and Accessories, Data Visualization, Science and Technology, Global
The dataset offers a broad view of consumer shopping trends. It contains detailed transactional data across product categories, customer demographics, and purchase channels. Its columns fall into five groups:
Customer Demographics: Customer ID, Age, Gender, Location
Transaction Details: Item Purchased, Category, Purchase Amount (USD), Size, Color, Season
Purchase Behavior: Review Rating, Subscription Status, Payment Method, Shipping Type
Promotional Information: Discount Applied, Promo Code Used
Customer History: Previous Purchases, Preferred Payment Method, Frequency of Purchases
Total Rows: 3,900
Total Columns: 19
Column Names: Customer ID, Age, Gender, Item Purchased, Category, Purchase Amount (USD), Location, Size, Color, Season, Review Rating, Subscription Status, Payment Method, Shipping Type, Discount Applied, Promo Code Used, Previous Purchases, Preferred Payment Method, Frequency of Purchases
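Once the data has been loaded, the dimensions and column names listed above can be checked programmatically. The following is a minimal base R sketch; `check_schema()` is a hypothetical helper name introduced here for illustration and is not part of the original analysis.

```r
# Column names as listed in the dataset description above.
expected_cols <- c(
  "Customer ID", "Age", "Gender", "Item Purchased", "Category",
  "Purchase Amount (USD)", "Location", "Size", "Color", "Season",
  "Review Rating", "Subscription Status", "Payment Method", "Shipping Type",
  "Discount Applied", "Promo Code Used", "Previous Purchases",
  "Preferred Payment Method", "Frequency of Purchases"
)

# Hypothetical helper: reports the dimensions and any mismatched column names.
check_schema <- function(df, expected = expected_cols) {
  list(
    n_rows     = nrow(df),
    n_cols     = ncol(df),
    missing    = setdiff(expected, names(df)),  # expected but absent
    unexpected = setdiff(names(df), expected)   # present but not listed
  )
}

# Usage, assuming the data frame is called shopping_trends:
# str(check_schema(shopping_trends))
```

For the full dataset, the expectation from the description is 3,900 rows, 19 columns, and empty `missing` and `unexpected` vectors.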
Based on the available data, here are three proposed figures that could provide valuable insights into consumer shopping trends:
Bar Chart: Distribution of purchases across different product categories, highlighting the most popular categories.
Stacked Bar Chart: Breakdown of purchase amounts by gender and age group, showing spending patterns across demographics.
Line Plot: Average purchase amount by season, revealing peak shopping seasons. The dataset records a Season for each purchase but no transaction date, so trends can only be compared across seasons, not over calendar time.
The data must be imported into a data frame named shopping_trends before any of the figures below are drawn.
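The import step itself is not shown in the report. A minimal sketch is given here, assuming the Kaggle CSV has been downloaded locally; the file name `shopping_trends.csv` and the helper name `load_shopping_trends()` are assumptions made for illustration. The point of the sketch is `check.names = FALSE`: without it, base R rewrites a header such as `Purchase Amount (USD)` to `Purchase.Amount..USD.`, and the backticked column names used in the plotting code would not be found.

```r
# Hypothetical helper: read the CSV while keeping the original column names
# (including spaces and parentheses) exactly as they appear in the header.
load_shopping_trends <- function(path) {
  read.csv(path, check.names = FALSE, stringsAsFactors = FALSE)
}

# Assumed local file name; adjust to wherever the Kaggle file was saved.
csv_path <- "shopping_trends.csv"
if (file.exists(csv_path)) {
  shopping_trends <- load_shopping_trends(csv_path)
  str(shopping_trends)  # quick look at column types
}
```

`readr::read_csv()` would also preserve the column names; base R is used here only to keep the sketch free of extra dependencies.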
Scatter Plot: Relationship between review ratings and purchase amounts
# Scatter Plot
library(ggplot2)  # must be loaded before the first ggplot() call
ggplot(shopping_trends, aes(x = `Review Rating`, y = `Purchase Amount (USD)`)) +
  geom_point() +
  theme_minimal() +
  labs(title = "Relationship Between Review Ratings and Purchase Amounts",
       x = "Review Rating",
       y = "Purchase Amount (USD)")
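With 3,900 points, markers in a scatter plot may overlap, so a correlation coefficient can serve as a numeric companion to the figure. The sketch below is one way to compute it in base R; `rating_amount_cor()` is a hypothetical helper name, and the choice between Pearson and Spearman is left to the analyst.

```r
# Hypothetical helper: correlation between review rating and purchase amount.
# method can be "pearson" (linear) or "spearman" (rank-based).
rating_amount_cor <- function(df, method = "pearson") {
  cor(df[["Review Rating"]], df[["Purchase Amount (USD)"]],
      method = method, use = "complete.obs")  # ignore rows with missing values
}

# Usage, assuming the data frame is called shopping_trends:
# rating_amount_cor(shopping_trends)
# rating_amount_cor(shopping_trends, method = "spearman")
```

A value near zero would suggest that ratings and purchase amounts are largely unrelated, which is worth knowing before reading structure into the plot.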
Box Plot: Comparison of purchase amounts across three product categories (Clothing, Footwear, and Accessories)
library(ggplot2)
library(dplyr)
# Filter the data for the selected categories
selected_categories <- shopping_trends %>%
  filter(Category %in% c("Clothing", "Footwear", "Accessories"))
# Boxplot
ggplot(selected_categories, aes(x = Category, y = `Purchase Amount (USD)`, fill = Category)) +
  geom_boxplot() +
  theme_minimal() +
  labs(title = "Boxplot of Purchase Amounts Across Product Categories",
       x = "Product Category",
       y = "Purchase Amount (USD)")
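The numbers behind a box plot (group size, quartiles, median) are often easier to compare in a table. The following base R sketch computes them for the same three categories; `category_summary()` is a hypothetical helper name, and the quartiles use the default method of `quantile()`.

```r
# Hypothetical helper: per-category count, quartiles, and median of purchase amount.
category_summary <- function(df,
                             categories = c("Clothing", "Footwear", "Accessories")) {
  keep <- df[df$Category %in% categories, ]
  # Re-factor so that categories absent from the data do not produce empty groups.
  amounts <- split(keep[["Purchase Amount (USD)"]],
                   factor(as.character(keep$Category)))
  data.frame(
    Category = names(amounts),
    n        = unname(vapply(amounts, length, integer(1))),
    q1       = unname(vapply(amounts, quantile, numeric(1), probs = 0.25, names = FALSE)),
    median   = unname(vapply(amounts, median, numeric(1))),
    q3       = unname(vapply(amounts, quantile, numeric(1), probs = 0.75, names = FALSE))
  )
}

# Usage, assuming the data frame is called shopping_trends:
# category_summary(shopping_trends)
```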
Heatmap: Correlation between customer age, number of previous purchases, and purchase amount
library(ggplot2)
library(dplyr)
library(tidyr)
library(reshape2)
##
## Attaching package: 'reshape2'
## The following object is masked from 'package:tidyr':
##
##     smiths
# Calculate the correlation matrix of the three numeric columns
cor_matrix <- cor(shopping_trends %>%
                    select(Age, `Previous Purchases`, `Purchase Amount (USD)`))
# Heatmap: melt() reshapes the matrix to long format (Var1, Var2, value)
heatmap_data <- melt(cor_matrix)
ggplot(heatmap_data, aes(Var1, Var2, fill = value)) +
  geom_tile() +
  scale_fill_gradient2(low = "blue", high = "red", mid = "white", midpoint = 0) +
  theme_minimal() +
  labs(title = "Correlation Heatmap",
       x = "Variable",
       y = "Variable",
       fill = "Correlation")
Bar Chart: Distribution of purchases across product categories
library(ggplot2)
library(dplyr)
# Bar Chart: geom_bar() counts the rows in each category
ggplot(shopping_trends, aes(x = Category)) +
  geom_bar(fill = "skyblue") +
  theme_minimal() +
  labs(title = "Distribution of Purchases Across Product Categories",
       x = "Product Category",
       y = "Count")
Pie Chart: Distribution of payment methods used
# Pie Chart: count purchases per payment method, then wrap a single stacked bar into a circle
payment_method_counts <- shopping_trends %>%
  count(`Payment Method`)
ggplot(payment_method_counts, aes(x = "", y = n, fill = `Payment Method`)) +
  geom_bar(stat = "identity", width = 1) +
  coord_polar(theta = "y") +
  theme_void() +  # drops axis text and grid lines, which carry no meaning on a pie chart
  labs(title = "Distribution of Payment Methods")
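Slice sizes in a pie chart are hard to compare by eye, so the underlying shares can be tabulated alongside it. The sketch below is a base R illustration; `payment_shares()` is a hypothetical helper name.

```r
# Hypothetical helper: count and percentage share of each payment method,
# sorted from most to least used.
payment_shares <- function(df) {
  counts <- table(df[["Payment Method"]])
  shares <- round(100 * prop.table(counts), 1)
  out <- data.frame(
    payment_method = names(counts),
    n              = as.integer(counts),
    percent        = as.numeric(shares),
    stringsAsFactors = FALSE
  )
  out[order(-out$n), ]
}

# Usage, assuming the data frame is called shopping_trends:
# payment_shares(shopping_trends)
```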
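Line Plot: Average purchase amount by season
# Line Plot
# Order the seasons by calendar; otherwise ggplot sorts them alphabetically.
# Assumes the Season column uses exactly these four labels (check with unique()).
season_order <- c("Spring", "Summer", "Fall", "Winter")
ggplot(shopping_trends,
       aes(x = factor(Season, levels = season_order), y = `Purchase Amount (USD)`, group = 1)) +
  geom_line(stat = "summary", fun = "mean") +
  theme_minimal() +
  labs(title = "Seasonal Trends in Purchase Amounts",
       x = "Season",
       y = "Average Purchase Amount (USD)")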
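The line plot draws one mean per season, so the same values can be computed directly as a check. This base R sketch is an illustration only; `seasonal_means()` is a hypothetical helper name, and the four season labels are an assumption about how the Season column is coded.

```r
# Assumed season labels, in calendar order; verify with unique(shopping_trends$Season).
season_levels <- c("Spring", "Summer", "Fall", "Winter")

# Hypothetical helper: mean purchase amount per season, returned in calendar order.
# Seasons that do not occur in the data come back as NA.
seasonal_means <- function(df, levels = season_levels) {
  season <- factor(df$Season, levels = levels)
  tapply(df[["Purchase Amount (USD)"]], season, mean, na.rm = TRUE)
}

# Usage, assuming the data frame is called shopping_trends:
# round(seasonal_means(shopping_trends), 2)
```

Fixing the factor levels matters because character values are otherwise sorted alphabetically (Fall, Spring, Summer, Winter), which would connect the seasons in a misleading order.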
Stacked Bar Chart: Breakdown of purchase amounts by gender and age group
# Bin ages into groups; cut() intervals are right-closed, so age 60 falls in "41-60"
shopping_trends <- shopping_trends %>%
  mutate(AgeGroup = cut(Age, breaks = c(0, 20, 40, 60, Inf),
                        labels = c("0-20", "21-40", "41-60", "61+")))
# Stacked Bar Chart: each bar stacks the purchase amounts in an age group, split by gender
ggplot(shopping_trends, aes(x = AgeGroup, y = `Purchase Amount (USD)`, fill = Gender)) +
  geom_col() +
  theme_minimal() +
  labs(title = "Breakdown of Purchase Amounts by Gender and Age Group",
       x = "Age Group",
       y = "Total Purchase Amount (USD)")
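The totals shown by the stacked bars can also be tabulated, which makes the bin boundaries explicit. The sketch below is a base R illustration with hypothetical helper names `age_group()` and `spend_by_group()`. Because `cut()` uses right-closed intervals by default, an age of exactly 60 falls in the 41-60 bin, so the last bin is labeled "61+" here.

```r
# Hypothetical helper: bin ages into (0,20], (20,40], (40,60], (60,Inf].
age_group <- function(age) {
  cut(age, breaks = c(0, 20, 40, 60, Inf),
      labels = c("0-20", "21-40", "41-60", "61+"))
}

# Hypothetical helper: total purchase amount per age group and gender.
# Empty combinations are reported as 0 (the default argument needs R >= 3.4).
spend_by_group <- function(df) {
  tapply(df[["Purchase Amount (USD)"]],
         list(AgeGroup = age_group(df$Age), Gender = df$Gender),
         sum, default = 0)
}

# Usage, assuming the data frame is called shopping_trends:
# spend_by_group(shopping_trends)
```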
Density Plot: Distribution of purchase amounts by gender
# Density Plot
ggplot(shopping_trends, aes(x = `Purchase Amount (USD)`, fill = Gender)) +
  geom_density(alpha = 0.5) +
  theme_minimal() +
  labs(title = "Distribution of Purchase Amounts by Gender",
       x = "Purchase Amount (USD)",
       y = "Density")
Interactive Scatter Plot: Relationship between review ratings and purchase amounts
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
##     last_plot
## The following object is masked from 'package:stats':
##
##     filter
## The following object is masked from 'package:graphics':
##
##     layout
# Interactive Scatter Plot
scatter_plot <- plot_ly(shopping_trends,
                        x = ~`Review Rating`, y = ~`Purchase Amount (USD)`,
                        type = 'scatter', mode = 'markers',
                        marker = list(color = 'rgba(152, 0, 0, .8)', size = 10)) %>%
  layout(title = "Relationship Between Review Ratings and Purchase Amounts",
         xaxis = list(title = "Review Rating"),
         yaxis = list(title = "Purchase Amount (USD)"))
scatter_plot